Weighted Likelihood Policy Search with Model Selection

Authors

Tsuyoshi Ueno

Kohei Hayashi

Takashi Washio

Yoshinobu Kawahara

Abstract

Reinforcement learning (RL) methods based on direct policy search (DPS) have been actively discussed to achieve an efficient approach to complicated Markov decision processes (MDPs). Although they have brought much progress in practical applications of RL, there still remains an unsolved problem in DPS related to model selection for the policy. In this paper, we propose a novel DPS method, weighted likelihood policy search (WLPS), where a policy is efficiently learned through the weighted likelihood estimation. WLPS naturally connects DPS to the statistical inference problem and thus various sophisticated techniques in statistics can be applied to DPS problems directly. Hence, by following the idea of the information criterion, we develop a new measurement for model comparison in DPS based on the weighted log-likelihood.
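As a rough illustration of what "weighted likelihood estimation" of a policy can look like, the sketch below fits a one-dimensional linear-Gaussian policy by maximizing a weighted log-likelihood of observed state-action pairs. It is not the WLPS algorithm from the paper, and it does not implement the paper's model-comparison criterion: the policy form, the function names (`fit_weighted_gaussian_policy`, `weighted_log_likelihood`), and the choice of non-negative weights derived from a toy reward are all assumptions made for illustration.

```python
import math
import random


def fit_weighted_gaussian_policy(states, actions, weights):
    """Weighted maximum likelihood for a policy a ~ N(k * s + b, sigma2).

    Hypothetical helper: each (state, action) pair contributes to the
    log-likelihood in proportion to its non-negative weight.
    """
    w_sum = sum(weights)
    s_mean = sum(w * s for w, s in zip(weights, states)) / w_sum
    a_mean = sum(w * a for w, a in zip(weights, actions)) / w_sum
    cov = sum(w * (s - s_mean) * (a - a_mean)
              for w, s, a in zip(weights, states, actions))
    var = sum(w * (s - s_mean) ** 2 for w, s in zip(weights, states))
    k = cov / var                      # weighted least-squares slope
    b = a_mean - k * s_mean            # weighted least-squares intercept
    resid = sum(w * (a - k * s - b) ** 2
                for w, s, a in zip(weights, states, actions))
    sigma2 = resid / w_sum             # weighted MLE of the noise variance
    return k, b, sigma2


def weighted_log_likelihood(states, actions, weights, k, b, sigma2):
    """Sum of per-sample Gaussian log-densities, scaled by the weights."""
    return sum(
        w * (-0.5 * math.log(2 * math.pi * sigma2)
             - (a - k * s - b) ** 2 / (2 * sigma2))
        for w, s, a in zip(weights, states, actions)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data (assumption): exploratory actions, weighted by how close
    # they are to a "good" action 2 * s + 1.
    states = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    actions = [2 * s + 1 + rng.gauss(0.0, 1.0) for s in states]
    weights = [math.exp(-(a - (2 * s + 1)) ** 2)
               for s, a in zip(states, actions)]
    k, b, sigma2 = fit_weighted_gaussian_policy(states, actions, weights)
    ll = weighted_log_likelihood(states, actions, weights, k, b, sigma2)
    print(f"k={k:.3f} b={b:.3f} sigma2={sigma2:.3f} weighted ll={ll:.3f}")
```

Because the objective is an ordinary (weighted) log-likelihood, standard statistical tools such as penalized likelihood comparisons can in principle be applied to it, which is the connection the abstract draws.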

Related resources

Factorized Asymptotic Bayesian Policy Search for POMDPs

This paper proposes a novel direct policy search (DPS) method with model selection for partially observed Markov decision processes (POMDPs). DPSs have been standard for learning POMDPs due to their computational efficiency and natural ability to maximize total rewards. An important open challenge for the best use of DPS methods is model selection, i.e., determination of the proper dimensionali...

A Multi-Objective Particle Swarm Optimization Algorithm for a Possibilistic Open Shop Problem to Minimize Weighted Mean Tardiness and Weighted Mean Completion Times

We consider an open shop scheduling problem. At first, a bi-objective possibilistic mixed-integer programming formulation is developed. The inherent uncertainty in processing times and due dates as fuzzy parameters, machine-dependent setup times and removal times are the special features of this model. The considered bi-objectives are to minimize the weighted mean tardiness and weighted mean co...

A Bayesian Nominal Regression Model with Random Effects for Analysing Tehran Labor Force Survey Data

Large survey data are often accompanied by sampling weights that reflect the inequality probabilities for selecting samples in complex sampling. Sampling weights act as an expansion factor that, by scaling the subjects, turns the sample into a representative of the community. The quasi-maximum likelihood method is one of the approaches for considering sampling weights in the frequentist framewo...

High-dimensional classification by sparse logistic regression

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. The bounds can be reduced under the additional low-noise condition. The proposed complexity penalty ...

Two-stage Model Selection with Parameters Weighted Hidden Markov Models and Likelihood Ratio for Part-of-speech Tagging

In many natural language processing applications two or more models usually have to be involved for accuracy. But it is difficult for minor models, such as “backoff” taggers in part-of-speech tagging, to cooperate smoothly with the major probabilistic model. We introduce a two-stage approach for model selection between hidden Markov models and other minor models. In the first stage, the major m...

Publication date: 2012
